Abstract

Background: It is suspected that early gastric carcinoma (GC) is a dormant variant that rarely progresses to 
advanced GC. We demonstrated that the dormant and aggressive variants of tubular adenocarcinomas (TUBs) of 
the stomach are characterized by loss of MYC and gain of TP53 and gain of MYC and/or loss of TP53, respectively. 
The aim of this study is to determine whether this is also the case in undifferentiated-type GCs (UGCs) of different 
genetic lineages: one with a layered structure (LS+), derived from early signet ring cell carcinomas (SIGs), and the 
other, mostly poorly differentiated adenocarcinomas, without LS but with a minor tubular component (TC), 
dedifferentiated from TUBs (LS-/TC+). 

Methods: Using 29 surgically resected stomachs with 9 intramucosal and 20 invasive UGCs (11 LS+ and 9 LS-/TC+), 
63 genomic DNA samples of mucosal and invasive parts and corresponding reference DNAs were prepared from 
formalin-fixed, paraffin-embedded tissues with laser microdissection, and were subjected to array-based 
comparative genomic hybridization (aCGH), using 60K microarrays, and subsequent unsupervised, hierarchical 
clustering. Of 979 cancer-related genes assessed, we selected genes with mean copy numbers significantly different 
between the two major clusters. 

Results: Based on similarity in genomic copy-number profile, the 63 samples were classified into two major 
clusters. Clusters A and B, which were rich in LS+ UGC and LS-/TC+ UGC, respectively, were discriminated on the 
basis of 40 genes. The aggressive pattern was more frequently detected in LS-/TC+ UGCs (20/26; 77%) than in LS+ 
UGCs (17/37; 46%; P = 0.0195), whereas no dormant pattern was detected in any of the UGC samples. 

Conclusions: In contrast to TUBs, copy number alterations of MYC and TP53 exhibited an aggressive pattern in LS+ 
SIG at early and advanced stages, indicating that early LS+ UGCs inevitably progress to an advanced GC. Cluster B 
(enriched in LS-/TC+) exhibited more frequent gain of driver genes and a more frequent aggressive pattern than 
cluster A, suggesting a potentially worse prognosis in UGCs of cluster B. 



Background 

Gastric carcinomas (GCs) have been classified histologically into intestinal,
diffuse and unclassified types by Lauren [1], and the unclassified type was
further divided into solid and mixed types by Carneiro [2]. The
undifferentiated-type gastric carcinoma (UGC) according to the Japanese
classification [3] mostly overlaps poorly differentiated GC, which comprises
not only the diffuse type including signet ring cell carcinoma (SIG) but also
the solid type and the mixed type with a minor tubular component (TC).

Recently it has been proposed that advanced diffuse-type GC may derive from
either early diffuse-type or intestinal-type GC. Well differentiated tubular
adenocarcinoma (TUB) can transform into poorly differentiated adenocarcinoma
(POR) after the silencing of cell adhesion-related genes including CDH1 [4,5].
Carneiro's mixed-type carcinomas may thus overlap dedifferentiated TUBs. It has
been reported that the survival rate of patients with mixed-type GCs was
significantly lower than that of patients with GCs of other types [2], whereas
the survival rate of early GC patients with SIG was higher than that of GC
patients without SIG [6]. Thus UGCs may be divided into subgroups with
different prognoses. Recently, a mass-screening program for neuroblastomas
[7-9] was suspended in Japan because a discontinuous genetic lineage was
observed between the early- and the late-presenting neuroblastomas. Negative
and late-presenting (>1 year) neuroblastomas exhibited near-diploidy with
terminal 1p deletion, whereas positive neuroblastomas in infants exhibited
near-triploidy without 1p deletion [10,11]. To perform such subgrouping, we
have classified UGCs based on the continuity of genetic lineages as well as
the expression of morphological lineage markers.

Our lineage analysis using chromosomal comparative 
genomic hybridization (CGH) was based on distinctive 
morphological lineage markers. A layered structure 
(LS) represents an incipient phase of SIG development 
[12] and is commonly retained even at an advanced 
stage in the human stomach. In tumour regions with 
LS, the mode of cell proliferation resembles that in the 
normal gastric mucosa. It is believed that tumour 
cells remain confined to the mucosa as long as they grow 
to form the LS [13]. Our lineage analyses confirmed 
that POR with LS was derived from intramucosal SIG, 
whereas POR without LS and with a minor TC (< 30%), 
was derived from TC [14,15]. However, the TC was not 
always derived from early TUB but could also be 
derived from SIG, whereas LS was scarcely derived 
from TUB [15]. Therefore, as a morphological lineage 
marker, LS may take priority over TC. In addition, 
UGCs without LS or TC due to secondary loss of these 
markers are observed, which prompted us to adopt 
array CGH (aCGH) and unsupervised cluster analyses 
of the aCGH data to classify UGCs solely on the basis 
of similarity in the genomic copy number profile. 

In differentiated-type gastric carcinomas (DGCs), our 
recent aCGH-based lineage analyses revealed two genetic 
lineages: one with copy-number loss of MYC and copy- 
number gain of TP53 (MYC- and TP53+), a dormant 
pattern, and the other with the copy-number gain of MYC 
and/or copy-number loss of TP53 (MYC+ and/or TP53-), 
an aggressive pattern. The dormant pattern accounted 
for 70% of intramucosal carcinoma samples and a half 
of the intramucosal part samples of invasive carcin- 
omas. The invasive parts of invasive carcinomas mostly 
exhibited the aggressive copy number alteration (CNA) 
pattern. When the intramucosal part of an advanced 
cancer was dormant, the lineage was discontinuous be- 
tween the mucosal and invasive parts. Therefore, the 
MYC-/TP53+ and MYC+ and/or TP53- CNA patterns 
may be signatures of dormant and aggressive TUBs, 
respectively [16]. 



In the present study, genomic DNA samples from the mucosal and invasive parts
of early and advanced UGCs were prepared and subjected to gene copy-number
analyses using aCGH, followed by unsupervised cluster analysis of the aCGH
data. Based on these results, we examined the relationship between
morphological and genetic lineage markers and identified several useful
lineage marker genes for UGC.

Methods 

The Institutional Review Board on Medical Ethics at 
Shiga University of Medical Science approved this study 
on the condition that the UGC samples used were an- 
onymous. Written informed consent was not required 
because this retrospective study used archival samples. 

Tissue samples 

This study included 29 surgically resected, buffered formalin-fixed,
paraffin-embedded UGCs: 20 with LS in at least part of the tumour (LS+,
9 intramucosal tumours and 11 invasive tumours) and 9 without LS but
containing a small TC (LS-/TC+, all invasive tumours) (Table 1). TC was
defined as a well or moderately differentiated adenocarcinoma component
comprising < 30% of the entire tumour [15]. All samples were selected from
GC cases diagnosed in our department from 1997 to 2011. Patients with
intramucosal LS+ UGCs averaged 57.6 years of age (range, 48-79), patients
with invasive LS+ UGCs 60.2 years (range, 48-79) and patients with invasive
LS-/TC+ UGCs 62.2 years (range, 50-75). The macroscopic classification was
determined according to the Japanese Classification of Gastric Cancer with
TNM staging [3].

LS evaluation 

LS was defined as in a previous study [17]. In brief, LS+ regions had small
carcinoma cells confined to the stroma at the gland-neck level that gradually
differentiated to signet ring cells in the superficial (and deep) lamina
propria (Figure 1a). The absence of LS in intramucosal regions of the tumour
was defined by four patterns: 1) contact of small carcinoma cells with the
muscularis mucosae in SIG, 2) mucinous adenocarcinoma, 3) POR and 4) the
presence of a TC (Figure 1b-f).

Laser microdissection and DNA preparation 

Tumour tissue samples were obtained from 5-μm-thick 
tissue sections using an LMD6000 laser microdissection 
system (Leica Microsystems, Wetzlar, Germany). For invasive 
cancers, DNA samples were obtained from both 
the intramucosal and invasive parts. For each sample, 
cancer tissues were obtained from an area >6 mm², in 
which cancer cells accounted for >70% of the total cell 

Table 1 Summary of clinicopathological characteristics of 29 UGCs

| Case no | Age/sex | Size of mucosal lesion (cm) | Macroscopic type* | Histological type* | aCGH sampling: intramucosal LS | aCGH sampling: intramucosal not LS | aCGH sampling: intramucosal TC | aCGH sampling: invasive part | Depth of invasion† | LN meta† | Stage† |
|---|---|---|---|---|---|---|---|---|---|---|---|
| M101 | 79/F | 8.5 x 4.0 | 0(IIc) | SIG > TC | + | NT | NT | | T1 (m) | N0 | IA |
| M102 | 48/F | 9.5 x 5.0 | 0(IIc) | SIG > POR1 | + | NT | - | | T1 (m) | N0 | IA |
| M103 | 57/M | 1.4 x 0.8 | 0(IIc) | SIG | + | NT | - | | T1 (m) | N0 | IA |
| M104 | 76/F | 6.0 x 5.0 | 0(IIc) | SIG | + | NT | - | | T1 (m) | N0 | IA |
| M105 | 50/M | 1.5 x 1.2 | 0(IIc) | SIG | + | NT | - | | T1 (m) | N0 | IA |
| M106 | 60/F | 1.2 x 1.0 | 0(IIc) | SIG | + | - | - | | T1 (m) | N0 | IA |
| M107 | 49/F | 4.0 x 2.5 | 0(IIc+III) | SIG > POR1 | + | NT | - | | T1 (m) | N0 | IA |
| M108 | 48/F | 6.0 x 2.8 | 0(IIc) | SIG > POR1 > TC | + | POR | NT | | T1 (m) | N0 | IA |
| M109 | 51/M | 5.3 x 3.3 | 0(IIc+III) | SIG | + | SIG | - | | T1 (m) | N0 | IA |
| SM101 | 71/F | 0.9 x 0.8 | 0(IIc) | SIG > POR2 | + | NT | - | NT | T1 (sm2) | N2 | II |
| A102 | 72/F | 5.0 x 3.0 | 0(IIc+IIb) | POR2 > POR1 > SIG | + | SIG | - | POR2 | T2 (mp) | N1 | II |
| A103 | 79/F | 12.0 x 8.5 | 0(IIa+IIb) | POR1 > TC > SIG > POR2 | + | POR | NI | NT | T2 (mp) | N1 | II |
| A104 | 49/M | 2.8 x 2.5 | 0(IIc+III) | SIG > POR2 > TC | + | NT | NT | POR2 | T2 (ss) | N1 | II |
| SM105 | 59/F | 11.5 x 7.0 | 0(IIa+IIc) | SIG > TC > POR2 | + | TUB2 | NT | POR2 | T1 (sm) | N1 | IB |
| SM106 | 72/M | 3.7 x 2.3 | 0(IIc) | SIG > POR1 | + | POR | - | NT | T1 (sm2) | N0 | IA |
| A107 | 48/F | 12.0 x 6.5 | 0(IIc+III) | SIG > POR2 > POR1 > MUC | + | POR | - | POR2 | T3 (se) | N2 | IIIB |
| A108 | 46/M | 4.0 x 2.8 | 0(IIc+III) | POR2 > POR1 > SIG | NI | NI | - | POR2 | T2 (mp) | N2 | IIIA |
| A109 | 55/M | 3.8 x 3.3 | 0(IIc+III) | SIG > POR1 | + | POR | - | SIG | T1 (sm2) | N0 | IA |
| A110 | 57/M | 5.5 x 2.2 | 0(IIc) | POR2 > SIG > TC | + | POR | - | POR2 | T2 (mp) | N1 | II |
| A111 | 54/F | 8.0 x 7.0 | 3 | POR2 > SIG | + | POR | | POR2 | T3 (se) | N2 | IIIB |
| SM201 | 75/F | 3.7 x 3.0 | 2 | POR1 > TC | | POR/TUB2 | + | POR | T1 (sm2) | N0 | IA |
| A202 | 60/M | 4.0 x 3.8 | 0(IIc) | POR2 > POR1 > TC > SIG | | POR/TUB2 | + | POR | T2 (mp) | N0 | IB |
| SM203 | 71/M | 4.5 x 2.0 | 0(IIa+IIc) | POR1 > TC > SIG | | POR/TUB2 | + | POR | T1 (sm2) | N0 | IA |
| A204 | 65/F | 5.5 x 3.0 | 3 | POR2 > TC > POR1 | | POR/TUB2 | + | POR | T3 (se) | N3 | IV |
| A205 | 54/M | 7.4 x 5.8 | 5 | POR2 > TC > POR1 | | POR/TUB2 | + | POR | T3 (se) | N0 | II |
| A206 | 67/F | 5.5 x 4.0 | 4 | POR2 > TC > POR1 | | POR/TUB2 | + | NT | T4 (si) | N2 | IV |
| A207 | 52/M | 6.0 x 4.0 | 4 | POR1 > POR2 > SIG > TC | | POR/TUB2 | + | POR | T3 (se) | N3 | IV |
| A208 | 75/M | 9.0 x 7.0 | 2 | POR1 > TC | | POR/TUB2 | + | POR | T2 (mp) | N1 | II |
| A209 | 50/F | 2.3 x 0.8 | 3 | POR2 > SIG > POR1 > TC | | POR/TUB2 | + | POR | T2 (mp) | N2 | IIIA |

*Japanese classification of gastric carcinoma, modified.
†TNM classification.

UGCs, Undifferentiated gastric carcinomas; aCGH, Array CGH; LN, Lymph node; LS, Layered structure; TC, Tubular component; SIG, Signet ring cell carcinoma; POR, 
Poorly differentiated adenocarcinoma; POR1, Solid POR; POR2, Non-solid POR; TUB2, Moderately differentiated adenocarcinoma; MUC, Mucinous adenocarcinoma; 
m, mucosa; sm, submucosa; mp, muscularis propria; ss, subserosa; se, serosal exposure; si, invasion to adjacent structures; M, Male; F, Female; NT, Not tested; NI, 
Not informative. 



count. Tissue samples were digested in 200 μg/ml proteinase K 
solution for approximately 72 hours at 37.0°C, 
and genomic DNA was extracted with phenol/chloroform. 

Whole genome amplification 

Sample DNA was amplified using the GenomePlex 
Whole Genome Amplification Kit (WGA2 Kit; Sigma, 
St. Louis, USA) [18]. For some DNA samples that could 
not be sufficiently amplified, the WGA5 Kit (Sigma) 
was employed. 



Array CGH 

An oligo CGH microarray (60K, 60-mer) (Agilent, 
Santa Clara, USA) was used in this study, according 
to the manufacturer's instructions. In brief, the amp- 
lified tumour and control DNA samples were non- 
enzymatically labelled with Cy5 and Cy3, respectively, 
using the Genome DNA ULS Labelling Kit (Agilent) 
and competitively hybridized to the microarray. The 
hybridized array images were captured using a DNA 
microarray scanner (Agilent) and then the fluorescence 

Figure 1 Histological appearances of intramucosal parts of undifferentiated-type gastric carcinomas (UGCs). A signet ring cell carcinoma 
(SIG) component with a layered structure in case A107 (a). Small carcinoma cells are distributed in the deeper part just above or in the muscularis 
mucosae in a SIG component in case M109 (b). A mucinous adenocarcinoma component in case A107 (c). A poorly differentiated 
adenocarcinoma component in case SM106 (d). Minor tubular components in cases SM105 and SM201, respectively (e, f). 



intensity of the tumour and control at each probe dot 
was calculated by Feature Extraction Ver. 9.5.3 (Agilent). 
The array data were normalized using Genomic Workbench 
software Ver. 5.0 (Agilent). The positions of oligomers 
are based on the Human Genome February 
2009 assembly (hg19). Copy-number gains and losses 
were defined as base 2 logarithms of the tumour to 
reference signal intensity ratio (T/R) 
greater than 0.3219 and less than -0.3219, respectively. 
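
The calling rule above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Agilent software's implementation: the thresholds of 0.3219 and -0.3219 come from the text and correspond to T/R ratios of about 1.25 and 0.8 (log2 1.25 ≈ 0.3219), while the helper name `call_copy_number` and the example intensities are hypothetical.

```python
from math import log2

GAIN_THRESHOLD = 0.3219    # log2(T/R) above this is called a gain (from the text)
LOSS_THRESHOLD = -0.3219   # log2(T/R) below this is called a loss (from the text)

def call_copy_number(tumour_signal, reference_signal):
    """Return 'gain', 'loss' or 'neutral' for one probe (hypothetical helper)."""
    ratio = log2(tumour_signal / reference_signal)
    if ratio > GAIN_THRESHOLD:
        return "gain"
    if ratio < LOSS_THRESHOLD:
        return "loss"
    return "neutral"

# Illustrative intensities only, not data from this study.
example_probes = [(1300, 1000), (1000, 1000), (750, 1000)]
calls = [call_copy_number(t, r) for t, r in example_probes]
```

With these made-up intensities the three probes fall above, between and below the two thresholds, respectively.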



Cluster analysis 

To perform novel subtyping of UGC samples based on 
genomic profile similarity in this study, an unsuper- 
vised hierarchical cluster analysis was applied across 63 
samples from 29 UGC cases by using the Cluster 3.0 
and TreeView software programs. The clustering algo- 
rithm was set to complete linkage clustering using an 
uncentered correlation. To enable unsupervised cluster 
analysis, we performed unbiased reduction in probe 

Figure 2 Frequency of copy-number alterations at the chromosome level. The percentage of the samples that have CNAs for each 
chromosome in the LS+ UGCs (a) and LS-/TC+ UGCs (b). Gains and losses are indicated with red and green, respectively. 

Figure 3 Unsupervised hierarchical cluster analysis of array-based comparative genomic hybridization (aCGH) data. Gene copy number 
gains and losses are indicated by red and green, respectively. A total of 63 samples from 29 UGCs were classified into two major clusters: A and 
B. Most samples of LS+ UGCs were included in cluster A and most LS-/TC+ UGC samples were in cluster B. All the intramucosal LS+ UGCs were 
included in cluster A. 



number from around 60,000 to several thousand 
probes. For this purpose, we selected large genes because 
the greater number of corresponding probes 
resulted in an improved signal-to-noise ratio of the representative 
gene copy numbers. The unsupervised strategy 
enabled us to set an internal standard to validate 
clustering results: the copy number profiles of samples 
from the same tumour should be more similar to each other 
than to any copy number profile from another tumour, because 
the gene alterations in the process of carcinogenesis 
are largely common among the samples from the same 
tumour. 
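
The clustering settings and the internal standard can be illustrated with a small pure-Python sketch. It does not reproduce Cluster 3.0; it assumes the usual definition of the uncentered correlation (the correlation computed without subtracting the means, i.e. the cosine similarity), takes 1 minus that value as the distance, and uses complete linkage. The function names and the four toy profiles are hypothetical and are not data from this study.

```python
from math import sqrt

def uncentered_correlation(x, y):
    """Correlation computed without subtracting the means (cosine similarity)."""
    num = sum(a * b for a, b in zip(x, y))
    den = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return num / den

def distance(x, y):
    return 1.0 - uncentered_correlation(x, y)

def complete_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster, cluster, distance)."""
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: use the largest pairwise distance.
                d = max(distance(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical log2 T/R profiles for two tumours, two samples each.
profiles = {
    "T1_mucosal":  [0.40, -0.35, 0.05, 0.50],
    "T1_invasive": [0.45, -0.30, 0.10, 0.55],
    "T2_mucosal":  [-0.40, 0.35, 0.60, -0.05],
    "T2_invasive": [-0.35, 0.40, 0.50, -0.10],
}
merges = complete_linkage(profiles)
```

In this toy case the internal standard holds: the two samples of each tumour merge with each other before either merges with a sample from the other tumour.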

Statistical analyses 

Differences in contingency tables were assessed for statistical 
significance using Fisher's exact test. A P value < 0.05 
(2-sided) was considered statistically significant. Welch's 
t test was used to evaluate the difference in mean DNA 
copy number for each probe between the two clusters of samples. 
The Bonferroni correction was used to correct for 
multiple comparisons. 



Results 

Samples analysed with array CGH 

Tissue samples were excised from 29 archived GC speci- 
mens by laser microdissection. The tissue sample popu- 
lation included 11 regions (from 9 intramucosal SIGs), 
of which 9 regions were LS+ and the other two LS-, 26 
regions (from 11 LS+ invasive UGC), of which 10 were 
LS+ mucosal regions, 8 were LS- mucosal regions and 8 
were invasive regions, and 26 regions (from 9 LS-/TC+ 
invasive UGCs): 9 intramucosal POR, 9 intramucosal TC 
and 8 invasive regions. 

Genome wide copy number alterations 

A plot of the genetic aberration penetrance for all chromosomes 
is shown for LS+ UGCs and LS-/TC+ UGCs in 
Figure 2a and Figure 2b, respectively. Copy-number gains 
and losses were more common in LS-/TC+ UGCs than in 
LS+ UGCs. The most frequent copy-number gains were 
detected at 3q26 (7/63 samples), 5p15 (8/63), 8p23 (9/63), 
8q24 (7/63) and 12p12 (6/63), while the most frequent copy-number 
losses were found at 7q36 (5/63) and 12p12 (5/63). 

LS+ UGCs

LS+ part of intramucosal LS+ UGCs

| Sample | TP53 | MYC |
|---|---|---|
| M101 | -0.34555 | 0.28869 |
| M102 | 0.08225 | 0.60689 |
| M103 | -0.07102 | -0.12499 |
| M104 | -0.42086 | 0.26935 |
| M105 | -0.31831 | 0.34773 |
| M106 | -0.15595 | -0.01566 |
| M107 | -0.25034 | 0.03396 |
| M108 | -0.05623 | 0.28600 |
| M109 | -0.08278 | -0.01045 |

LS- part of intramucosal LS+ UGCs

| Sample | TP53 | MYC |
|---|---|---|
| M108 | -0.61834 | 0.09889 |
| M109 | -0.89144 | 0.21814 |

Mucosal LS+ part of invasive LS+ UGCs

| Sample | TP53 | MYC |
|---|---|---|
| SM101 | -0.04180 | -0.25511 |
| SM105 | 0.59827 | 0.01471 |
| SM106 | -0.50626 | 0.08768 |
| SM109 | -0.84266 | 0.33573 |
| A102 | -0.02558 | 0.00061 |
| A103 | 0.29978 | -0.31085 |
| A104 | 0.24469 | -0.04286 |
| A107 | -0.07716 | 0.33741 |
| A110 | -0.08916 | -0.08619 |
| A111 | 0.02225 | -0.19611 |

Mucosal LS- part of invasive LS+ UGCs

| Sample | TP53 | MYC |
|---|---|---|
| SM105TC | 0.36432 | -0.15383 |
| SM106POR | -0.26588 | 0.38758 |
| SM109POR | -0.39487 | -0.17875 |
| A102MMC | -0.66527 | 0.02019 |
| A103POR | 0.47621 | -0.10465 |
| A107MMC | -0.03163 | 0.68949 |
| A110POR | -0.27518 | -0.05547 |
| A111MMC | 0.26744 | 0.89916 |

Invasive POR part of LS+ UGCs

| Sample | TP53 | MYC |
|---|---|---|
| SM105 | 0.54888 | 0.02869 |
| SM109 | -0.57843 | -0.00150 |
| A102 | 0.39086 | 0.18062 |
| A104 | -0.41720 | -0.15054 |
| A107 | -0.24722 | 0.37732 |
| A108 | 0.43686 | -0.17062 |
| A110 | 0.12658 | 0.22526 |
| A111 | 0.46698 | 0.02603 |

LS-/TC+ UGCs

Mucosal POR part of invasive LS-/TC+ UGCs

| Sample | TP53 | MYC |
|---|---|---|
| SM201 | -0.12299 | 0.42723 |
| SM203 | -0.37356 | 0.10365 |
| A202 | -0.06185 | 0.75227 |
| A204 | -0.97802 | 0.08942 |
| A205 | -0.51666 | 0.13256 |
| A206 | -0.41075 | 0.30174 |
| A207 | -0.16871 | 1.51943 |
| A208 | -0.48922 | -0.05760 |
| A209 | -0.44817 | -0.12850 |

Mucosal TC part of invasive LS-/TC+ UGCs

| Sample | TP53 | MYC |
|---|---|---|
| SM201 | 0.10594 | 0.16098 |
| SM203 | -1.32617 | 0.51865 |
| A202 | 0.28919 | 1.09902 |
| A204 | -0.11258 | 0.52595 |
| A205 | -0.94866 | 0.23809 |
| A206 | -0.50405 | -0.03043 |
| A207 | -0.70435 | 0.02984 |
| A208 | -0.70839 | -0.11336 |
| A209 | 0.17935 | -0.21899 |

Invasive part of LS-/TC+ UGCs

| Sample | TP53 | MYC |
|---|---|---|
| SM201POR | -0.41347 | 0.17736 |
| SM203POR | -0.18086 | -0.00824 |
| A202POR | 0.33278 | 0.43179 |
| A204POR | -0.63912 | 0.14680 |
| A205POR | 0.02259 | 0.08873 |
| A206TC | -0.16128 | -0.05205 |
| A207POR | -0.50111 | 1.74383 |
| A208POR | -0.23164 | -0.16978 |

Figure 4 Array CGH data of MYC and TP53 in LS+ UGCs and LS-/TC+ UGCs. LS+ UGCs are divided into intramucosal cancers and invasive 
cancers. Numerals mean the base 2 logarithm of the test/reference signal intensity ratios of array CGH data. Significant gains and losses are 
indicated with red and green, respectively. The samples marked with and without grey margin are included in cluster B and cluster A, 
respectively, in Figure 3. 
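
As a minimal sketch of how the Figure 4 values translate into pattern calls, the snippet below applies the gain and loss thresholds from the Methods to the 11 samples of intramucosal LS+ UGCs. The pattern definitions (aggressive: MYC gain and/or TP53 loss; dormant: MYC loss and TP53 gain) follow the text; the function name `cna_pattern`, the label "neither" and the sample keys with an "LS-" suffix are naming assumptions made for illustration.

```python
GAIN, LOSS = 0.3219, -0.3219  # log2(T/R) thresholds from the Methods

def cna_pattern(myc, tp53):
    """Classify one sample from its MYC and TP53 log2(T/R) values."""
    if myc > GAIN or tp53 < LOSS:
        return "aggressive"   # MYC+ and/or TP53-
    if myc < LOSS and tp53 > GAIN:
        return "dormant"      # MYC- and TP53+
    return "neither"          # hypothetical label for all other samples

# (TP53, MYC) values of the intramucosal LS+ UGC samples listed in Figure 4.
intramucosal = {
    "M101": (-0.34555, 0.28869), "M102": (0.08225, 0.60689),
    "M103": (-0.07102, -0.12499), "M104": (-0.42086, 0.26935),
    "M105": (-0.31831, 0.34773), "M106": (-0.15595, -0.01566),
    "M107": (-0.25034, 0.03396), "M108": (-0.05623, 0.28600),
    "M109": (-0.08278, -0.01045),
    "M108 LS-": (-0.61834, 0.09889), "M109 LS-": (-0.89144, 0.21814),
}
patterns = {name: cna_pattern(myc, tp53)
            for name, (tp53, myc) in intramucosal.items()}
n_aggressive = sum(p == "aggressive" for p in patterns.values())
n_dormant = sum(p == "dormant" for p in patterns.values())
```

Applied to these values, the rule yields 6 aggressive samples out of 11 and no dormant sample.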



Copy-number alterations (CNAs) common to all the 
samples from the same tumour were called stemline 
changes [14]; they are estimated to occur at the earliest stage of 
tumourigenesis and to be inherent to the tumour lineage. 
Stemline gains of 3q26 were detected in 2/20 cases of invasive 
LS+ UGCs and in none of the invasive LS-/TC+ UGCs. 
In contrast, stemline gains of 5p15, 8p23 and 12p12 were 
detected in 2/9 cases of invasive LS-/TC+ UGCs but in 
no case of invasive LS+ UGCs. No stemline losses were 
detected in any of the UGC cases. 

Previous studies using chromosomal or array CGH ana- 
lyses [19-27] reported that frequent CNAs in gastric 
cancers (common to both UGC and DGC) were chromo- 
somal gains at 3q, 5p, 7p, 8q, 13q, 17q, 20p and 20q, and 



losses at 4q, 5q, 6q, 9p, 17p, 18q and 21q. In the UGCs 
examined in the present study, all previously reported 
CNAs were observed except gains at 17q and 20p and 
losses at 5q and 6q. Gains at 8p and 12p were common in 
LS-/TC+ UGCs. Copy-number gains at 8q24 were com- 
mon in both types of UGCs, with 4/20 cases of LS+ UGCs 
and 3/9 cases of invasive LS-/TC+ UGCs, but these were 
not stemline changes. 

Impartial selection of genes reflecting the whole 
genome profile 

To classify UGC samples based on the overall similarity 
in the profile of gene copy number changes, we used un- 
supervised hierarchical cluster analysis. For this purpose, 

[Figure 5 heat map. Panels: Cluster A, Cluster B. Legible gene labels: KIAA1804, RAB9A, DMD, TERT, INSR, CDH2. Sample legend: intramucosal LS+ UGCs, invasive LS+ UGCs, invasive LS-/TC+ UGCs.]

Figure 5 Array CGH data of genes other than MYC and TP53 with significantly different T/R ratio between clusters A and B. UGCs are 
divided into clusters A and B that were determined in Figure 3. The heat map indicates the base 2 logarithm of the test/reference signal intensity 
ratios of array CGH data. Gains and losses are indicated with red and green, respectively. 



it was necessary to reduce the number of gene probes 
used in the cluster analysis from 60K to several thousand. 
The reduced number of genes should still reflect 
the whole genome profile if impartially selected. To fulfil 
these conditions, we selected genes based solely on the 
size of genes (the numbers of corresponding probes). 
After repeated trials of cluster analyses using genes of 
various minimum sizes (or probe numbers per gene), we 
observed that most samples from the same tumour 
clustered more closely together than with any sample from 
another UGC case when we analysed only genes with 3 
or more probes per gene: a total of 5019 genes. 

Classification of UGC using hierarchical cluster analysis 

We applied an unsupervised two-dimensional hierarchical 
clustering algorithm to the 63 DNA samples 
from 29 UGCs. The samples were classified into 
two major clusters, A and B, based on similarity in the 
genome profile (Figure 3). Of the 37 LS+ UGC samples, 30 
were classified into cluster A and only 7 into cluster B. 
Of the 26 LS-/TC+ UGC samples, 8 were classified into cluster 
A and 18 into cluster B. All intramucosal LS+ UGCs 
were included in cluster A. Clusters A and B had significantly 
different proportions of morphological subtypes 
(P = 0.0001). 

Copy number alterations of MYC and TP53 

Gains at 8q24 were common alterations for both LS+ 
and LS-/TC+ UGCs. The representative gene located at 
this locus is MYC. Gains of MYC were detected in 2/11 
samples of intramucosal LS+ UGCs (18.2%), 6/26 of invasive LS+ UGCs 
(23.1%) and 8/26 of invasive LS-/TC+ UGCs (30.8%). 
The aggressive pattern (MYC+ and/or TP53-) was detected 
in 6/11 samples of intramucosal LS+ UGCs (54.5%), 11/26 
of invasive LS+ UGCs (42.3%) and 20/26 of invasive LS-/TC+ 
UGCs (76.9%; Figure 4). Therefore, the aggressive 
pattern was more frequently detected in invasive LS-/TC+ 
UGCs than in LS+ UGCs (P = 0.0195). The dormant 
pattern (MYC- and TP53+) was not detected in 
any of the UGC samples, even those from intramucosal 
GCs (Figure 4). 
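
The comparison of aggressive-pattern frequencies (20/26 vs. 17/37 samples) is a 2 x 2 contingency test. The sketch below shows one common way to compute a two-sided Fisher's exact test in pure Python, by summing the hypergeometric probabilities of all tables that are no more likely than the observed one. It is an illustration only: the statistical software the authors used is not stated, the function name is hypothetical, and other two-sided conventions can give slightly different values.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables that are no more likely than the observed table.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))

# Rows: LS-/TC+ samples, LS+ samples; columns: aggressive, not aggressive.
p_value = fisher_exact_two_sided(20, 6, 17, 20)
significant = p_value < 0.05
```

The counts are those reported in the text; the resulting `p_value` falls below the 0.05 significance level stated in the Methods.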

Copy number alterations of genes other than 
MYC or TP53 

As mentioned above, 5p15 was one of the most frequent 
gain sites in invasive LS-/TC+ UGCs (8/26; 30.8%), but a gain at this site 
was not detected in any of the 37 intramucosal and invasive 
LS+ UGC samples (Figure 2). The target genes located at 
this locus may include the telomerase reverse transcriptase 
gene (TERT) because a TERT gain was more frequently 
detected in invasive LS-/TC+ UGCs than in 
intramucosal and invasive LS+ UGCs (16/26 vs. 1/37, 
P < 0.0001) (Figure 5). In contrast, losses of TERT were 
detected in 4/37 samples of intramucosal and invasive 
LS+ UGCs (10.8%) but not in invasive LS-/TC+ UGCs. 

Welch's t test was performed to compare the mean T/R 
ratio between the samples in cluster A and those in 
cluster B at each of the 2756 probe loci of 979 cancer-related 

Table 2 List of 40 genes that have CNAs significantly different between clusters A and B

| Probe name | Location | Name of gene | Description | P value | P value after Bonferroni correction |
|---|---|---|---|---|---|
| A_16_P41637097 | Xp21.2 | DMD | dystrophin | 4.541E-08 | 1.251E-04 |
| A_14_P133591 | 5q21-q22 | APC | adenomatous polyposis coli | 5.774E-08 | 1.591E-04 |
| A_14_P130973 | 4q11-q12 | *KIT | v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog | [illegible] | [illegible] |
| A_14_P125447 | 13q12-q14 | SMAD9 | SMAD family member 9 | 1.373E-07 | 3.784E-04 |
| A_14_P100439 | 12q24.3 | *RAN | RAN, member RAS oncogene family | 1.876E-07 | 5.171E-04 |
| A_14_P102616 | 20q11.2 | GDF5 | growth differentiation factor 5 | 2.828E-07 | 7.794E-04 |
| A_14_P133647 | 11q23.3 | **ETS1 | v-ets erythroblastosis virus E26 oncogene homolog 1 (avian) | 5.603E-07 | 0.0015 |
| A_14_P138640 | 5p14.3 | CDH18 | cadherin 18, type 2 | 6.138E-07 | 0.0017 |
| A_14_P128664 | 18q23 | ATP9B | ATPase, class II, type 9B | 6.474E-07 | 0.0018 |
| A_14_P124801 | 19p12 | PBX4 | pre-B-cell leukemia homeobox 4 | 1.010E-06 | 0.0028 |
| A_14_P118423 | Xq28 | *RAB39B | RAB39B, member RAS oncogene family | 1.573E-06 | 0.0043 |
| A_14_P134602 | 17q11.2 | NF1 | neurofibromin 1 | 1.802E-06 | 0.0050 |
| A_14_P120351 | 5q31.1 | IL5 | interleukin 5 (colony-stimulating factor, eosinophil) | 1.988E-06 | 0.0055 |
| A_14_P125637 | 6q16.1 | **EPHA7 | EPH receptor A7 | 2.153E-06 | 0.0059 |
| A_14_P100300 | 13q12-q14 | SMAD9 | SMAD family member 9 | [illegible] | 0.0070 |
| A_14_P201681 | 7p21.1 | ITGB8 | integrin, beta 8 | [illegible] | 0.0071 |
| A_14_P130112 | Xp22.2 | *RAB9A | RAB9A, member RAS oncogene family | 2.949E-06 | 0.0081 |
| A_14_P120484 | 1q41 | RPS6KC1 | ribosomal protein S6 kinase, 52 kDa, polypeptide 1 | [illegible] | 0.0084 |
| A_16_P16709446 | 4q13.1 | **EPHA5 | EPH receptor A5 | [illegible] | [illegible] |
| A_14_P201127 | 2q32 | DLX2 | distal-less homeobox 2 | [illegible] | 0.0102 |
| A_14_P126957 | 11p11.2 | **SPI1 | spleen focus forming virus (SFFV) proviral integration oncogene | [illegible] | 0.0119 |
| A_14_P104667 | 8q22.2 | STK3 | serine/threonine kinase 3 | 4.386E-06 | 0.0121 |
| A_14_P137889 | 14q13.3 | NKX2-8 | NK2 homeobox 8 | 5.393E-06 | 0.0149 |
| A_14_P109970 | 1p36.1-p35 | **EPHB2 | EPH receptor B2 | [illegible] | 0.0183 |
| A_14_P118116 | Xp21.2 | DMD | dystrophin | [illegible] | [illegible] |
| A_16_P01378894 | 5q34 | ATP10B | ATPase, class V, type 10B | [illegible] | [illegible] |
| A_14_P139456 | 17q25.1 | *RAB37 | RAB37, member RAS oncogene family | 1.001E-05 | 0.0276 |
| A_14_P111361 | 17q21.2 | [illegible] | [illegible] | [illegible] | [illegible] |
| A_14_P134909 | 19p13.3-p13.2 | INSR | insulin receptor | 1.110E-05 | 0.0306 |
| A_16_P02740008 | 13q12 | ATP8A2 | ATPase, aminophospholipid transporter, class I, type 8A, member 2 | 1.129E-05 | 0.0311 |
| A_14_P102858 | 1q42 | KIAA1804 | mixed lineage kinase 4 | 1.155E-05 | 0.0318 |
| A_14_P113857 | 12p13 | **ETV6 | ets variant 6 | 1.172E-05 | 0.0323 |
| A_14_P115054 | 16q22.3 | ZFHX3 | zinc finger homeobox 3 | 1.180E-05 | 0.0325 |
| A_14_P138431 | 1p32-p31 | ROR1 | receptor tyrosine kinase-like orphan receptor 1 | 1.209E-05 | 0.0333 |
| A_18_P22746653 | 3p25.3 | ATP2B2 | ATPase, Ca++ transporting, plasma membrane 2 | 1.282E-05 | 0.0353 |
| A_14_P105811 | 11q13 | MEN1 | multiple endocrine neoplasia I | 1.318E-05 | 0.0363 |
| A_14_P136621 | 18q11.2 | CDH2 | cadherin 2, type 1, N-cadherin (neuronal) | 1.420E-05 | 0.0391 |
| A_14_P103176 | 1p34.3 | **EPHA10 | EPH receptor A10 | 1.455E-05 | 0.0401 |
| A_16_P17370843 | 5q34 | ATP10B | ATPase, class V, type 10B | 1.556E-05 | 0.0429 |
| A_14_P108129 | 5p15.33 | *TERT | telomerase reverse transcriptase | 1.584E-05 | 0.0436 |
| A_16_P19750359 | 13q12 | ATP8A2 | ATPase, aminophospholipid transporter, class I, type 8A, member 2 | 1.586E-05 | 0.0437 |
| A_14_P200005 | 1p36 | ATP13A2 | ATPase type 13A2 | 1.586E-05 | 0.0437 |
| A_16_P01183532 | 5p15.2 | **TRIO | trio Rho guanine nucleotide exchange factor | 1.786E-05 | 0.0492 |

In the column of gene name, "*" indicates genes related to tumour growth and "**" those related to invasion and metastasis. 



genes. Forty-three gene probes, belonging to 40 genes, 
had significantly different mean T/R ratios between clusters 
A and B at a level of P < 0.05 after Bonferroni correction 
(Table 2). Of the 40 genes, 6 genes (KIT, RAN, 
RAB39B, RAB9A, RAB37 and TERT), including proto-oncogenes, 
have been implicated in enhanced tumour 
growth, 8 genes (ETS1, SPI1, ETV6, EPHA7, EPHA5, 
EPHB2, EPHA10 and TRIO) in invasion/metastasis and 3 
genes (APC, NF1 and MEN1) in tumour suppression 
(Table 2). Most of the log2 T/R ratios of the 43 distinguishing 
gene probes were of opposite sign between clusters A and 
B, with greater absolute values in cluster B (Figure 5). 
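
The Bonferroni step can be sketched as follows. The sketch assumes that the correction multiplied each unadjusted P value by the number of probe loci tested (2756, as stated above); this assumption reproduces the adjusted values listed in Table 2 to within rounding. The function name and dictionary layout are hypothetical, and only three Table 2 entries are used as examples.

```python
N_TESTS = 2756  # probe loci tested, as stated in the text

def bonferroni(p_value, n_tests=N_TESTS):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Unadjusted P values copied from Table 2.
unadjusted = {"SMAD9": 1.373e-07, "GDF5": 2.828e-07, "TRIO": 1.786e-05}
adjusted = {gene: bonferroni(p) for gene, p in unadjusted.items()}
significant = [gene for gene, p in adjusted.items() if p < 0.05]
```

For these three probes the adjusted values agree with the Table 2 column "P value after Bonferroni correction" (3.784E-04, 7.794E-04 and 0.0492), and all remain below 0.05.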

Discussion 

Based on chromosomal CGH analysis, we have reported
that there are two distinct UGC lineages: the LS+ lineage
derived from early SIG and the LS-/TC+ lineage
dedifferentiated from TUB [15]. The former is characterized
by LS and the latter by a small TC. However, there are
also UGCs without these morphological lineage markers.
In the present study, we classified UGCs based on
similarity in the whole-genome copy-number profile among
samples using unsupervised hierarchical cluster analysis
and examined the correlation between this gene-based
classification and morphological lineage markers.

Using 5019 large genes and aCGH data from 63 DNA
samples from 29 UGCs, we confirmed that most of the
samples examined from the same tumour were clustered
more closely together than with any other sample, thus
fulfilling the criteria for our internal standard. On the basis
of this observation, we performed an unsupervised
two-dimensional hierarchical cluster analysis. All the samples
were classified into two major clusters, A and B (Figure 3).
Cluster A was rich in LS+ UGCs, whereas cluster B was
rich in LS-/TC+ UGCs. This difference was statistically
significant (P = 0.0001), indicating that classification
by the presence or absence of LS and TC correlates well
with the genomic-profile-based classification and
validating LS+ and LS-/TC+ as lineage markers.
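
The internal standard described above (samples from the same tumour should lie closer to each other than to samples from other tumours) can be checked before clustering. The following is only a rough sketch under stated assumptions: each sample is taken to be a vector of log2 T/R ratios, Euclidean distance is used although the distance measure is not specified here, and all sample names and values are hypothetical.

```python
import math
from collections import Counter


def euclidean(u, v):
    """Euclidean distance between two equal-length profiles."""
    if len(u) != len(v):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_neighbour(sample, profiles):
    """Name of the other sample whose profile is closest to `sample`."""
    others = (name for name in profiles if name != sample)
    return min(others, key=lambda name: euclidean(profiles[sample], profiles[name]))


def fraction_paired_with_own_tumour(profiles, tumour_of):
    """Fraction of samples whose nearest neighbour comes from the same tumour.

    Only samples from tumours with at least two samples are counted,
    because a single-sample tumour cannot pair with itself.
    """
    per_tumour = Counter(tumour_of[s] for s in profiles)
    eligible = [s for s in profiles if per_tumour[tumour_of[s]] > 1]
    if not eligible:
        raise ValueError("no tumour has more than one sample")
    hits = sum(
        tumour_of[nearest_neighbour(s, profiles)] == tumour_of[s] for s in eligible
    )
    return hits / len(eligible)


# Hypothetical log2 T/R profiles (4 genes) for two tumours, two samples each.
example_profiles = {
    "T1_mucosal": [0.10, -0.20, 0.00, 0.30],
    "T1_invasive": [0.15, -0.25, 0.05, 0.35],
    "T2_mucosal": [-0.40, 0.50, 0.60, -0.10],
    "T2_invasive": [-0.45, 0.55, 0.50, -0.20],
}
example_tumour_of = {
    "T1_mucosal": "T1",
    "T1_invasive": "T1",
    "T2_mucosal": "T2",
    "T2_invasive": "T2",
}
example_fraction = fraction_paired_with_own_tumour(example_profiles, example_tumour_of)
print(example_fraction)
```

A fraction close to 1 would suggest that the chosen gene set and distance measure preserve within-tumour similarity, which is the precondition the text sets for interpreting the two major clusters.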

All the intramucosal LS+ UGCs were included in cluster
A, suggesting that most UGCs in cluster A were
derived from intramucosal SIG, and that the LS-/TC+
UGCs in cluster A may have secondarily lost LS. The
LS-/TC+ UGCs in cluster A may also be derived from
SIG, as suggested by chromosomal CGH studies [15]. In
contrast, LS in advanced LS+ UGCs in cluster B (A107,
A108 and A110) was virtually indistinguishable
morphologically but showed genomic constitutions different from
LS in cluster A. This may be a kind of phenocopy: a
fraction of LS+ UGCs were considerably similar in genomic
profile to LS-/TC+ UGCs. Although LS exhibits regular
cell proliferation and differentiation and a superficially
spreading dormant growth [13], it is suggested that LS
itself is not a marker of persistent tumour dormancy but
has the potential to progress to an advanced stage with
a prognosis as poor as that for LS-/TC+ UGCs. This
situation may resemble that in chronic myeloid leukaemia,
in which blastic transformation occurs after a dormant
phase of well-retained cellular differentiation.

Most UGCs exhibited the aggressive genomic pattern
(TP53- and/or MYC+), even 55% of intramucosal LS+
UGCs, an incidence comparable to that in invasive UGCs.
The dormant pattern (MYC- and TP53+) was not
detected in any of the UGC samples, even in intramucosal
UGCs. These intramucosal UGCs are distinct from early
DGCs, in which 70% are of the dormant type [16].
Therefore, TP53 and MYC are less useful as prognostic
markers in UGCs than in DGCs.
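
The two patterns discussed above can be written as a simple decision rule: aggressive means MYC gain and/or TP53 loss, and dormant means MYC loss together with TP53 gain. The sketch below is illustrative only. The function name, the use of log2 T/R ratios as input and the cut-off of 0.2 for calling a gain or loss are assumptions and are not taken from the source.

```python
def classify_pattern(myc_log2, tp53_log2, threshold=0.2):
    """Classify a sample as 'aggressive', 'dormant' or 'other'.

    aggressive: MYC gain and/or TP53 loss
    dormant:    MYC loss and TP53 gain
    other:      any remaining combination

    `threshold` is a hypothetical cut-off on the absolute log2 T/R ratio
    used to call a gain (above +threshold) or a loss (below -threshold).
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    myc_gain = myc_log2 > threshold
    myc_loss = myc_log2 < -threshold
    tp53_gain = tp53_log2 > threshold
    tp53_loss = tp53_log2 < -threshold

    if myc_gain or tp53_loss:
        return "aggressive"
    if myc_loss and tp53_gain:
        return "dormant"
    return "other"


# Hypothetical (MYC, TP53) log2 T/R ratios for four made-up samples.
example_samples = {
    "sample_a": (0.5, 0.0),
    "sample_b": (0.0, -0.5),
    "sample_c": (-0.5, 0.5),
    "sample_d": (0.0, 0.0),
}
example_calls = {
    name: classify_pattern(myc, tp53) for name, (myc, tp53) in example_samples.items()
}
print(example_calls)
```

The two named patterns are mutually exclusive by construction, since the dormant rule requires MYC loss and TP53 gain while the aggressive rule requires MYC gain or TP53 loss.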

To explore other genes important for differentiation
of genetic lineage and for UGC prognosis, we first
compared the profiles of chromosomal copy-number
alterations (CNAs) between LS+ and LS-/TC+ UGCs. As
shown in Figure 2, CNAs detected in LS+ tumours but
not in LS-/TC+ tumours include 3q26 gain, a locus
likely to include SKIL because the average SKIL copy
number was greater in LS+ tumours than in LS-/TC+
tumours (P = 0.0060). SKIL encodes the SnoN protein, which
is proto-oncogenic by antagonizing cytostatic responses
of TGF-β [28,29] and anti-oncogenic by activating p53
[30]. Those CNAs with the opposite pattern (present in
LS-/TC+ tumours but not in LS+ tumours) were gains
at 5p15, 8p23 and 12p12. The target genes at 5p15 and
12p12 include TERT and KRAS, respectively, because
gains of TERT and KRAS were more frequently
detected in invasive LS-/TC+ UGCs than in intramucosal
and invasive LS+ UGCs (P < 0.0001 and P = 0.0032,
respectively). No target gene was detected at 8q23.

Our second approach to identify lineage-specific CNAs
was a screening of genes (from 979 cancer-related genes)
that showed significantly different mean T/R ratios
between the samples of clusters A and B. We selected 40
genes that were significantly different between clusters
(using t test after Bonferroni correction), of which 6 were
related to enhanced tumour growth and 8 to
invasion/metastasis (Table 2). As shown in Figure 5, genes that drive
tumourigenesis were more common in cluster B and
showed larger amplitude CNAs. Thus, UGCs in cluster B
may be more dependent on oncogenic genomic alterations
and less on environmental and epigenetic alterations than
those in cluster A.

The possible drivers of tumour growth screened
included KIT, TERT, and RAS family genes. KIT encodes a
receptor tyrosine kinase that is activated by stem cell
factor binding and initiates numerous signal transduction
pathways linked with the processes of apoptosis,
proliferation and tumourigenesis [31]. RAS family genes
encode small GTPases that play a key role in transduction
of signals from receptor kinases to the pathways of various
cellular processes [32]. TERT encodes the telomerase
catalytic subunit, which not only plays an important role in
cellular immortalization by telomere elongation [33,34] but
also activates cell proliferation [35]. The possible drivers of
invasion and metastasis screened included ETS1 and Ephrin
receptor genes. ETS1 encodes a transcription factor, the Ets1
proto-oncoprotein, that promotes invasiveness and is an
indicator of poor outcome in epithelial cancers through
regulation of MMP1, MMP3, MMP9, uPA, VEGF and VEGF
receptor expression [36]. The Ephrin receptor genes (EPHB2,
EPHA7, EPHA5 and EPHA10) encode ephrin receptors with
tyrosine kinase activity that affect tumour growth,
invasiveness, angiogenesis, and metastasis [37].

There were no significant differences in the mean copy
number of CDH1 and its transcriptional repressor genes
(SNAI1, SNAI2, ZEB1, ZEB2, TWIST1, etc.) between
clusters A and B, although these genes were reportedly
associated with a poorly differentiated phenotype and
poor clinical outcome [38]. However, these genes may
still participate in UGC tumourigenesis through
epigenetic silencing [39].

We are now extending this study to validate, by
quantitative PCR, the UGC-associated genes indicated by
aCGH and to correlate their genomic copy number with
gene expression and prognosis. Thereafter, using
quantitative PCR analyses instead of aCGH, similar analyses
should be applied to a greater number of tumour cases
with known outcomes.

Conclusions 

Unsupervised cluster analyses of aCGH data of multiple
samples from early and advanced UGCs have
demonstrated that early UGCs, including LS+ types in which
polarity of cell proliferation and differentiation is well
retained, have aggressive potential. Therefore, eradication
of UGCs at early stages may contribute to better
patient survival. In addition, it was observed that the two
UGC lineages, one derived from early SIG and the other
from TUB, have different genomic copy-number alteration
profiles, resulting in different sets of genes contributing to
tumourigenesis. The latter lineage from TUB may be more
dependent on genomic copy-number alterations and have
a poorer outcome than UGCs derived from SIG.